@pkoutsovasilis pkoutsovasilis commented Nov 20, 2025

Overview

This PR implements support for multiple StackConfigPolicies (SCPs) targeting the same Elasticsearch cluster or Kibana instance, using a weight-based priority system for deterministic policy composition.

Key Features

Weight-Based Priority System

  • Policies are merged in order of weight (lower weight takes precedence)
  • Default weight: 0
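As a rough illustration of the ordering rule above (not the operator's actual code), the sketch below composes flat settings so that the lowest weight wins on a key collision. The `Policy` class, the `compose` helper, and the sample setting names are hypothetical and exist only for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Policy:
    """Hypothetical stand-in for a StackConfigPolicy; not the real CRD type."""
    name: str
    weight: int = 0  # default weight of 0, as stated in the description
    settings: dict = field(default_factory=dict)


def compose(policies):
    """Merge flat settings so that the lowest weight wins on key collisions."""
    merged = {}
    # Apply the highest weight first; lower-weight policies are applied
    # later and therefore overwrite earlier values.
    for policy in sorted(policies, key=lambda p: p.weight, reverse=True):
        merged.update(policy.settings)
    return merged


fallback = Policy("fallback", weight=10, settings={"setting.a": "50mb", "setting.b": 1})
preferred = Policy("preferred", settings={"setting.a": "100mb"})
result = compose([fallback, preferred])
```

Here `preferred` keeps the default weight of 0, so its value for `setting.a` replaces the one from `fallback`, while `setting.b` is carried over unchanged.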

Conflict Detection

Conflicts are detected across multiple dimensions and will prevent policy application:

| Conflict Type | Condition | Result |
|---|---|---|
| Weight Conflict | Two or more policies with identical weights target the same Elasticsearch/Kibana | ❌ Conflict |
| SecretMount Name Conflict | Different policies define SecretMount with same SecretName | ❌ Conflict |
| SecretMount Path Conflict | Different policies define SecretMount with same MountPath | ❌ Conflict |
| Different Weights | Policies have different weights and none of the above applies | ✅ Pass - lower weight wins |

Important: Even if two policies with the same weight have non-overlapping resources, they still conflict because the weight collision makes the merge order ambiguous.
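The conflict rules above can be sketched roughly as follows. This is a minimal illustration under assumptions, not the PR's implementation: `Policy`, `SecretMount`, and `find_conflicts` are hypothetical names, and the input is assumed to be the set of policies already known to target the same Elasticsearch or Kibana resource.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class SecretMount:
    secret_name: str
    mount_path: str


@dataclass
class Policy:
    """Hypothetical stand-in for a StackConfigPolicy targeting one resource."""
    name: str
    weight: int = 0
    secret_mounts: list = field(default_factory=list)


def find_conflicts(policies):
    """Return a description of every conflict among the given policies."""
    by_weight = defaultdict(set)
    by_secret = defaultdict(set)
    by_path = defaultdict(set)
    for policy in policies:
        by_weight[policy.weight].add(policy.name)
        for mount in policy.secret_mounts:
            # Sets of policy names: repeats inside one policy do not count.
            by_secret[mount.secret_name].add(policy.name)
            by_path[mount.mount_path].add(policy.name)

    conflicts = []
    indexes = (("weight", by_weight), ("SecretName", by_secret), ("MountPath", by_path))
    for label, index in indexes:
        for value in sorted(index):
            owners = index[value]
            if len(owners) > 1:
                conflicts.append(f"{label} {value!r} shared by {', '.join(sorted(owners))}")
    return conflicts


a = Policy("a", 0, [SecretMount("certs", "/mnt/certs")])
b = Policy("b", 0)
c = Policy("c", 5, [SecretMount("certs", "/mnt/other")])
conflicts = find_conflicts([a, b, c])
```

In this example `a` and `b` collide on weight even though `b` defines no resources at all, which mirrors the note above about ambiguous merge order.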

Configuration Merging Behaviour

Different merge strategies are applied based on the configuration type:

  • Deep Merge (recursive merging):

    • ClusterSettings
    • Config
    • SnapshotLifecyclePolicies
    • SecurityRoleMappings
    • IndexLifecyclePolicies
    • IngestPipelines
    • IndexTemplates.ComposableIndexTemplates
    • IndexTemplates.ComponentTemplates
  • Top-Level Key Replacement (entire keys replaced):

    • SnapshotRepositories - each repository configuration is treated atomically
  • Union Merge (with conflict detection):

    • SecretMounts - conflicts on duplicate SecretName OR duplicate MountPath
    • SecureSettings - merges by SecretName+Key, lower weight wins (no conflicts)
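To make the difference between the first two strategies concrete, here is a small sketch. The helper names and sample values are made up for illustration, and the functions only approximate the behaviour described above; they are not taken from the PR.

```python
def deep_merge(low_priority, high_priority):
    """Recursively merge two dicts; high_priority wins on leaf collisions."""
    merged = dict(low_priority)
    for key, value in high_priority.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def replace_top_level(low_priority, high_priority):
    """Merge only at the top level; each key's value is taken whole."""
    return {**low_priority, **high_priority}


# Deep merge, e.g. for ClusterSettings: sibling keys from both sides survive.
heavy = {"indices": {"recovery": {"max_bytes_per_sec": "50mb"}, "other": {"x": 1}}}
light = {"indices": {"recovery": {"max_bytes_per_sec": "100mb"}}}
cluster_settings = deep_merge(heavy, light)

# Top-level replacement, e.g. for SnapshotRepositories: the repository
# definition from the winning side replaces the other one entirely.
heavy_repos = {"backups": {"type": "gcs", "settings": {"bucket": "b1", "base_path": "p"}}}
light_repos = {"backups": {"type": "s3", "settings": {"bucket": "b2"}}}
repositories = replace_top_level(heavy_repos, light_repos)
```

With deep merge, `indices.other` from the heavier policy is preserved next to the overridden recovery setting. With top-level replacement, `base_path` disappears because the whole `backups` repository comes from the lower-weight policy.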

Multi-Soft-Owner Secret Management

File Settings and Policy Config Secrets:

  • Now support multiple soft owners
  • Secrets are only deleted when all referencing soft-owners are removed
  • Uses the eck.k8s.elastic.co/owner-refs annotation, which holds a JSON-encoded map of owner namespaced names

Secret Sources:

  • Remain single soft owner (existing behaviour unchanged)

This prevents secret leakage while enabling proper cleanup when policies are deleted.
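The multi-owner bookkeeping could look roughly like the sketch below. Only the annotation name comes from the description; the helper functions, the `namespace/name` key format, and the empty-object map values are assumptions made for illustration.

```python
import json

OWNER_REFS_ANNOTATION = "eck.k8s.elastic.co/owner-refs"  # name from the description


def _load_owners(annotations):
    """Decode the owner map; a missing annotation means no owners."""
    return json.loads(annotations.get(OWNER_REFS_ANNOTATION, "{}"))


def add_soft_owner(annotations, namespace, name):
    """Return a copy of the annotations with the given policy recorded as owner."""
    owners = _load_owners(annotations)
    owners[f"{namespace}/{name}"] = {}  # assumed value shape; only the key matters here
    return {**annotations, OWNER_REFS_ANNOTATION: json.dumps(owners, sort_keys=True)}


def remove_soft_owner(annotations, namespace, name):
    """Return (updated annotations, True if the secret has no owners left)."""
    owners = _load_owners(annotations)
    owners.pop(f"{namespace}/{name}", None)
    updated = {**annotations, OWNER_REFS_ANNOTATION: json.dumps(owners, sort_keys=True)}
    return updated, not owners


annotations = add_soft_owner({}, "elastic-system", "policy-a")
annotations = add_soft_owner(annotations, "elastic-system", "policy-b")
annotations, delete_after_a = remove_soft_owner(annotations, "elastic-system", "policy-a")
annotations, delete_after_b = remove_soft_owner(annotations, "elastic-system", "policy-b")
```

Removing the first policy leaves the secret in place because `policy-b` still references it; only the second removal reports that the secret can be deleted.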

Related Issues

@pkoutsovasilis pkoutsovasilis self-assigned this Nov 20, 2025
@pkoutsovasilis pkoutsovasilis added the >enhancement Enhancement of existing functionality label Nov 20, 2025

prodsecmachine commented Nov 20, 2025

Snyk checks have passed. No issues have been found so far.

| Status | Scanner | Critical | High | Medium | Low | Total (0) |
|---|---|---|---|---|---|---|
| | Open Source Security | 0 | 0 | 0 | 0 | 0 issues |
| | Licenses | 0 | 0 | 0 | 0 | 0 issues |

github-actions bot commented Nov 20, 2025

🔍 Preview links for changed docs

@kvalliyurnatt
Contributor

Just a general question/comment that came to mind: do we want any limit on the number of SCPs that can be associated with a cluster?
I was trying to think about what issues we might face when a large number of SCPs (if that is even a practical scenario) are associated with a single ES cluster, and whether there is a practical maximum that we can enforce. One thing that came to mind about scale was the annotation for soft owners: could we run into some kind of limit on the annotation map size? (I think the annotation limit is 256KB, which should not be something to worry about?)
I wonder if there are any other such things to consider.

Collaborator

@pebrc pebrc left a comment

I did a first pass, just looking at the code. I have not tested it yet. Will try to find some more time later today.

@pkoutsovasilis
Contributor Author

buildkite test this -f p=gke,t=TestStackConfigPolicy*

@pkoutsovasilis
Contributor Author

buildkite test this -f p=gke,t=TestStackConfigPolicy*

Collaborator

@pebrc pebrc left a comment

LGTM, nice work!

I think we have two follow-up items:

  1. improve the error/change attribution

The problem mentioned by @barkbay earlier is worse for errors, which are displayed in the status resource of any contributing policy; we currently leave it up to the user to trace back which source an error came from. Can you maybe raise an issue for that?

NAMESPACE        NAME                        READY   PHASE             AGE   WEIGHT
elastic-system   elasticsearch-only-policy   1/2     ApplyingChanges   12m   0
elastic-system   kibana-only-policy          1/2     ApplyingChanges   12m   9
  2. documentation (needs to go into the docs-content repo)

Labels

>enhancement Enhancement of existing functionality v3.3.0 (next)

Development

Successfully merging this pull request may close these issues.

Support multiple StackConfigPolicies per cluster